English-Czech Machine Translation Using TectoMT
Abstract
English-to-Czech machine translation, as implemented in the TectoMT system, consists of three phases: analysis, transfer and synthesis. The system uses tectogrammatical (deep-syntactic dependency) trees as the transfer medium. Each phase is divided into so-called blocks, which are processing units that solve linguistically interpretable tasks (e.g., statistical part-of-speech tagging or rule-based placement of clitics). This paper briefly introduces the linguistic layers of language description that are used for the translation and describes the basic concepts of the TectoMT framework. The translation results are evaluated using both the automatic metric BLEU and human judgments from the WMT 2010 evaluation.
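The three-phase, block-based organization described in the abstract can be illustrated with a small sketch. This is not TectoMT code and does not use its API: the names `Document`, `SCENARIO`, `run_scenario` and the three toy blocks are hypothetical, and the blocks operate on flat token lists rather than on tectogrammatical trees. It only shows the idea of a translation scenario as an ordered sequence of small, single-purpose processing units grouped into analysis, transfer and synthesis.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Document:
    """Hypothetical container passed from block to block (not a TectoMT class)."""
    source: str
    layers: Dict[str, object] = field(default_factory=dict)
    log: List[str] = field(default_factory=list)


# A block is any callable that reads and enriches the document.
Block = Callable[[Document], Document]


def tokenize(doc: Document) -> Document:
    """Analysis block (toy): whitespace tokenization instead of real parsing."""
    doc.layers["tokens"] = doc.source.split()
    return doc


def toy_transfer(doc: Document) -> Document:
    """Transfer block (toy): word-for-word lookup in a two-entry lexicon."""
    lexicon = {"yes": "ano", "no": "ne"}  # illustrative assumption only
    tokens = doc.layers["tokens"]
    doc.layers["target_tokens"] = [lexicon.get(t.lower(), t) for t in tokens]
    return doc


def join_tokens(doc: Document) -> Document:
    """Synthesis block (toy): linearize target tokens into a sentence."""
    doc.layers["target"] = " ".join(doc.layers["target_tokens"])
    return doc


# A scenario is an ordered list of phases, each holding an ordered list of blocks.
SCENARIO: List[Tuple[str, List[Block]]] = [
    ("analysis", [tokenize]),
    ("transfer", [toy_transfer]),
    ("synthesis", [join_tokens]),
]


def run_scenario(text: str) -> Document:
    """Apply every block of every phase in order and record what ran."""
    doc = Document(source=text)
    for phase, blocks in SCENARIO:
        for block in blocks:
            doc = block(doc)
            doc.log.append(f"{phase}:{block.__name__}")
    return doc


if __name__ == "__main__":
    result = run_scenario("yes no")
    print(result.layers["target"])
    print(result.log)
```

Keeping each block small and independent is what allows a statistical component (such as a tagger) and a rule-based one (such as clitic placement) to coexist in one pipeline, as the abstract notes; in this sketch, swapping a block means replacing one entry in `SCENARIO`.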
Similar resources
New Language Pairs in TectoMT
The TectoMT tree-to-tree machine translation system has been updated this year to support easier retraining for more translation directions. We use multilingual standards for morphology and syntax annotation and language-independent base rules. We include a simple, non-parametric way of combining TectoMT’s transfer model outputs. We submitted translations by the English-to-Czech and Czech-to-Eng...
Ways to Improve the Quality of English-Czech Machine Translation
This thesis describes English-Czech machine translation as it is implemented in the TectoMT system. The transfer uses deep-syntactic dependency (tectogrammatical) trees and exploits the annotation scheme of the Prague Dependency Treebank. The primary goal of the thesis is to improve the translation quality using both rule-based and statistical methods. First, we present a manual annotation of translatio...
TectoMT: Highly Modular MT System with Tectogrammatics Used as Transfer Layer
We present a new English→Czech machine translation system combining linguistically motivated layers of language description (as defined in the Prague Dependency Treebank annotation scenario) with statistical NLP approaches.
TectoMT – a Deep-Linguistic Core of the Combined Chimera MT system
Chimera is a machine translation system that combines the TectoMT deep-linguistic core with the phrase-based MT system Moses. For the English–Czech pair it also uses the Depfix postcorrection system. All the components run on the Unix/Linux platform and are open source (available from the Perl repository CPAN and the LINDAT/CLARIN repository). The main website is https://ufal.mff.cuni.cz/tectomt. The developme...
Czechizator - Čechizátor
We present a lexicon-less rule-based machine translation system from English to Czech, based on a very limited number of transformation rules. Its core is a novel translation module, implemented as a component of the TectoMT translation system, and depends massively on the extensive pipeline of linguistic preprocessing and postprocessing within TectoMT. Its scope is naturally limited, but for s...
Publication date: 2010